HMM Content Model for TAC2010 Summarization Challenge
Abstract
We present the HITS submission for the 2010 TAC Guided Summarization Task. We focus on the main multi-document summarization task rather than the update task. We implement a baseline extractive summarization system from the literature (Barzilay and Lee, 2004), which uses a Hidden Markov Model to assign content (topic) labels to sentences, predicts which topics are most likely to appear in the summary, and constructs the summaries from these topics. We find that this model performs worse than expected given the results reported in previous work. These differences may be attributed to the changes we made to the algorithm to accommodate the multi-document summarization task and to the lack of human-annotated domains for the training data.
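The pipeline described in the abstract (label sentences with an HMM, then keep sentences whose topics are likely to appear in a summary) can be sketched roughly as follows. This is a minimal illustration, not the submitted system: the function names `viterbi` and `select_sentences`, the input layout, and the per-topic summary probabilities are all assumptions made for this example. In Barzilay and Lee (2004) the emission scores come from state-specific language models; here they are simply taken as given.

```python
import math


def viterbi(emission_logp, trans_logp, init_logp):
    """Return the most likely topic-state sequence for a list of sentences.

    emission_logp[t][s]: log P(sentence t | state s)   (assumed precomputed)
    trans_logp[r][s]:    log P(state s | previous state r)
    init_logp[s]:        log P(first state is s)
    """
    n_states = len(init_logp)
    scores = [init_logp[s] + emission_logp[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at position t
    for t in range(1, len(emission_logp)):
        new_scores, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: scores[r] + trans_logp[r][s])
            pointers.append(best_prev)
            new_scores.append(scores[best_prev] + trans_logp[best_prev][s]
                              + emission_logp[t][s])
        scores = new_scores
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def select_sentences(sentences, labels, topic_in_summary_prob, max_sentences):
    """Keep sentences whose topic is most likely to occur in a summary.

    topic_in_summary_prob[k] is a hypothetical estimate of how often
    topic k appears in reference summaries. Original order is preserved.
    """
    ranked = sorted(range(len(sentences)),
                    key=lambda i: -topic_in_summary_prob[labels[i]])
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

In a multi-document setting, each document would be decoded separately and the selected sentences pooled; how to merge and order them across documents is one of the adaptations the abstract alludes to and is not covered by this sketch.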
Similar resources
TMSP: Topic Guided Manifold Ranking with Sink Points for Guided Summarization
Guided summarization is an extension of query-focused multidocument summarization. We proposed a novel ranking algorithm, Topic Guided Manifold Ranking with Sink Points (TMSP) for guided summarization tasks of TAC2010. TMSP is a topic extended version of Manifold Ranking with Sink Points (MRSP), which handles the Update Summarization tasks of TAC2009 well. We adopt the TMSP and MRSP methods to ...
CLASSY Query-Based Multi-Document Summarization
Our summarizer is based on an HMM (Hidden Markov Model) for sentence selection within a document and a pivoted QR algorithm to generate a multi-document summary. Each year, since we began participating in DUC in 2001, we have modified the features used by the HMM and have added linguistic capabilities in order to improve the summaries we generate. Our system, called “CLASSY” (Clustering, Lingui...
Automatic Segmentation and Summarization of Spoken Lectures
The ever-increasing number of online lectures has created an unprecedented opportunity for distance learning. Most online lectures are presented as unstructured text, audio and/or video files which make it difficult for students to locate relevant lectures and browse through them. In this thesis, we investigated several automatic lecture segmentation and summarization algorithms. Automatic lectur...
Extractive Chinese Spoken Document Summarization Using Probabilistic Ranking Models
The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to deal with the extractive summarization problem under a probabilistic modeling fram...
Learning to Model Domain-Specific Utterance Sequences for Extractive Summarization of Contact Center Dialogues
This paper proposes a novel extractive summarization method for contact center dialogues. We use a particular type of hidden Markov model (HMM) called Class Speaker HMM (CSHMM), which processes operator/caller utterance sequences of multiple domains simultaneously to model domain-specific utterance sequences and common (domainwide) sequences at the same time. We applied the CSHMM to call summar...
Publication date: 2010